Drop old quantization flows #3115
base: main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3115
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 6 Pending. As of commit 04ea5f8 with merge base c96f2dd. NEW FAILURE: the following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
How about inlining _int8wo_api, _int8da_int8w_api, and _int4wo_api? They are each used only once across the codebase.
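For illustration, a rough sketch of what inlining could look like at the single call site, assuming the benchmark switches to the new quantize_ configs; the config names below are taken from current torchao and the _bench_all_flows helper is hypothetical, so treat the exact mapping as an assumption:

import copy

import torch
from torchao.quantization import (
    Int4WeightOnlyConfig,
    Int8DynamicActivationInt8WeightConfig,
    Int8WeightOnlyConfig,
    quantize_,
)

def _bench_all_flows(model, example_input):
    # Instead of keeping one-off wrappers like _int8wo_api, call quantize_
    # directly with the matching config at the single call site.
    for config in (
        Int8WeightOnlyConfig(),
        Int8DynamicActivationInt8WeightConfig(),
        Int4WeightOnlyConfig(),
    ):
        m = copy.deepcopy(model)
        quantize_(m, config)
        m(example_input)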
yeah I think that's fine if they're only used in benchmarks
also cc @jainapurva, can you take a look at the benchmark changes?
print("_int8da_int8w_api") | ||
|
||
for M, N, K in all_shapes: | ||
_bench_quantized_tensor_subclass_perf( |
Temporarily updated this to use the new APIs in two places to fix CI, but maybe we can update _bench_quantized_tensor_subclass_perf to compare only the original vs. the new quantization flows?
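If the helper were narrowed to a single old-vs-new comparison, it could look roughly like the sketch below; the api_func/ref_api_func parameters and the body are assumptions about the helper's shape, not its actual signature:

import copy

import torch
from torch.utils.benchmark import Timer

def _bench_quantized_tensor_subclass_perf(api_func, ref_api_func, M, N, K):
    # Hypothetical narrowed helper: time the new flow against the reference
    # (old) flow on the same eager linear model and input shape.
    model = torch.nn.Sequential(torch.nn.Linear(K, N)).eval()
    x = torch.randn(M, K)

    timings = {}
    for name, apply_fn in (("new", api_func), ("ref", ref_api_func)):
        m = copy.deepcopy(model)
        apply_fn(m)
        t = Timer(stmt="m(x)", globals={"m": m, "x": x}).blocked_autorange()
        timings[name] = t.median * 1e3  # milliseconds
    print(f"M={M} N={N} K={K}: new {timings['new']:.3f} ms vs ref {timings['ref']:.3f} ms")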
Hi @namgyu-youn, thanks for working on this. I think it looks good overall, but it seems like we removed some things outside the scope of #2745, like smoothquant. Can you please add these back?
@@ -1,266 +0,0 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
Hey @namgyu-youn I don't think we want to remove smoothquant? This wasn't part of the issue: #2745. Can you add these back?
return wrapper


class SmoothquantUnitTest(unittest.TestCase):
Same here, please add the smoothquant tests back
On the SmoothQuant API side, the two (torchao/quantization/smoothquant.py and torchao/quantization/prototype/smoothquant/) are just different implementations, so we can revert them. But on the tests side, because we dropped

from torchao.quantization.subclass import (
    Int4WeightOnlyQuantizedLinearWeight,
    Int8DynamicallyQuantizedLinearWeight,
    Int8WeightOnlyQuantizedLinearWeight,
)

the tests (SmoothquantUnitTest) can't be maintained. So my suggestion is: how about also dropping the old SmoothQuant API in this PR? The new API also resolves #1639 and has a better structure, I think.
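For context, a rough sketch of how the dropped tensor-subclass flow maps onto the new quantize_ API; the config names are from current torchao and the exact mapping is my assumption:

import torch
from torchao.quantization import Int8WeightOnlyConfig, quantize_

model = torch.nn.Sequential(torch.nn.Linear(128, 256)).eval()

# Old flow (removed): Int8WeightOnlyQuantizedLinearWeight.from_float(linear.weight)
# swapped the weight tensor for a quantized subclass by hand.
# New flow: quantize_ rewrites the module in place given a config.
quantize_(model, Int8WeightOnlyConfig())

# Int8DynamicActivationInt8WeightConfig and Int4WeightOnlyConfig play the same
# role for the other two dropped subclasses.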
Int8DynamicallyQuantizedLinearWeight,
Int8WeightOnlyQuantizedLinearWeight,
from torchao.quantization.subclass import (
    QuantizedLinearWeightBase,
Didn't we remove this class in this PR? It seems we need to delete this import completely?
Done, thanks for pointing it out
return y


def dynamically_quantize_per_channel(x, quant_min, quant_max, target_dtype):
Hey @namgyu-youn I don't think this is part of the scope, can you add this function back? Issue #2745 only refers to everything in subclass.py, dynamic_quant.py, and weight_only.py.
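For reference, a minimal sketch of what a per-channel dynamic quantization helper like this typically computes (symmetric per-channel scales); this is an illustration, not necessarily the exact torchao implementation that was removed:

import torch

def dynamically_quantize_per_channel(x, quant_min, quant_max, target_dtype):
    # Symmetric per-channel (per output row) quantization of a 2D weight:
    # pick a scale from each row's observed absolute max, then round and clamp.
    eps = torch.finfo(torch.float32).eps
    amax = x.abs().amax(dim=1, keepdim=True)
    scales = (amax / quant_max).clamp(min=eps)
    zero_points = torch.zeros(x.shape[0], dtype=torch.int64)
    quant = torch.clamp(torch.round(x / scales), quant_min, quant_max).to(target_dtype)
    return quant, scales.squeeze(1), zero_points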
Reverted; it was a misunderstanding while checking the API structure.
Summary:
This PR drops the old quantization flows and applies the corresponding changes in the rest of the codebase to remove them.
Test plan: test/integration/test_integration.py